# Quantile Ratio Regression Panel – Monte Carlo Simulation

This README describes the R script that implements the Monte Carlo simulations for the **panel quantile ratio regression (QRR) model** used in the paper.

The script is fully self-contained: it generates simulated data, estimates several models, computes performance metrics, and saves the resulting R workspace to a single `.RData` file.

---

## 1. Purpose of the script

The script runs a Monte Carlo experiment that compares four estimators for a panel data model based on **quantile ratio regression**:

- **Unregularized QRR panel estimator** (`qrr_panel`)
- **QRR estimator from the `Qtools` package** (baseline)
- **Naive estimator** based on differences of separate quantile regressions with unit fixed effects
- **Ridge / Empirical Bayes QRR panel estimator** (`qrr_panel_l2` with data-driven lambda)

For each simulation, the script:

1. Generates panel data with individual fixed effects and correlated regressors.
2. Constructs a response variable from a lognormal distribution whose quantile ratio matches the model structure.
3. Fits the four estimators.
4. Stores:
   - Estimated regression coefficients (`beta`)
   - Estimated individual effects (`alpha`) when available
   - Out-of-sample performance metrics based on the log residual ratio error (LRE).
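
Step 2 can be illustrated with a minimal sketch. For a lognormal variable, the quantile at level `p` is `exp(meanlog + sdlog * qnorm(p))`, so the ratio of two quantiles depends only on `sdlog`. The helper below searches for the `sdlog` that reproduces a target ratio with `optimise` and checks it against the closed form. The name `match_sdlog`, the quantile levels 0.25 and 0.75, the choice of numerator, and the search interval are assumptions made for illustration; the script's own settings may differ.

```r
# Hypothetical helper (not the script's code): find the lognormal sdlog
# such that Q(tau_hi) / Q(tau_lo) equals a target quantile ratio.
match_sdlog <- function(target_ratio, tau_lo = 0.25, tau_hi = 0.75, meanlog = 0) {
  loss <- function(s) {
    r <- qlnorm(tau_hi, meanlog, s) / qlnorm(tau_lo, meanlog, s)
    (log(r) - log(target_ratio))^2   # squared error on the log scale
  }
  # The search interval is an assumption; it must contain the solution.
  optimise(loss, interval = c(1e-6, 10), tol = 1e-10)$minimum
}

target <- 3                          # illustrative target ratio
sd_num <- match_sdlog(target)

# Closed form for comparison: log(ratio) / (z_hi - z_lo)
sd_closed <- log(target) / (qnorm(0.75) - qnorm(0.25))

# Draw a few responses from the matched lognormal distribution
set.seed(1)
y <- rlnorm(5, meanlog = 0, sdlog = sd_num)
```

Because the ratio does not depend on `meanlog`, the numerical search and the closed form should agree up to the optimiser's tolerance.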

Finally, it summarizes performance via RMSE and average LRE across simulations and saves the entire R workspace.
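
As a rough illustration of the RMSE summary, the sketch below computes the RMSE of each parameter across simulations and then averages over parameters. It assumes estimates are stored as a matrix with one row per simulation and one column per parameter; the function name `rmse_per_param_sketch` and the toy numbers are hypothetical and are not taken from the script.

```r
# Hypothetical sketch: RMSE per parameter across simulations.
# est   : matrix, rows = simulations, columns = parameters (assumed layout)
# truth : numeric vector of true parameter values, one per column
rmse_per_param_sketch <- function(est, truth) {
  err <- sweep(est, 2, truth, FUN = "-")   # subtract the truth column-wise
  sqrt(colMeans(err^2))                    # root mean squared error per column
}

# Toy example: two simulations, two parameters
est   <- rbind(c(1, 2),
               c(3, 2))
truth <- c(2, 2)

rmse_each <- rmse_per_param_sketch(est, truth)  # 1 0
rmse_mean <- mean(rmse_each)                    # 0.5
```

Averaging the per-parameter RMSE yields one number per estimator, which makes the methods easy to compare at the cost of hiding which parameters drive the error.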

---

## 2. Script file

The script contains:

1. Definition of the core estimators:
   - `qrr_panel()`: unregularized FE estimator for the QRR panel model.
   - `qrr_panel_l2()`: ridge-penalized estimator for the QRR panel model.

2. Utility functions:
   - `estimate_tau2()`: estimates the variance component of the fixed effects (between-unit variance) from FE estimates and group sizes.
   - `.as_param_by_sim()`, `.align_to_truth()`, `rmse_per_param_vectruth()`, `rmse_mean_vectruth()`: helpers that compute the RMSE per parameter and the mean RMSE across simulations.
   - Wrapper functions:
     - `alpha_rmse_per_param_simple()`, `alpha_rmse_mean_simple()`
     - `beta_rmse_per_param_simple()`, `beta_rmse_mean_simple()`

3. Monte Carlo setup and DGP:
   - Panel dimensions: `n_id` (number of individuals), `T_per_id` (maximum number of time periods per individual).
   - True parameter vector `trueBe` for the regressors.
   - Distribution of individual effects `alpha_true_id` (normal with mean `mu_alpha` and variance `sigma_alpha^2`).
   - Construction of unbalanced panels via `Ti` and logical vector `keep`.
   - Multivariate normal regressors `V1, …, V6` with covariance matrix `Sigma`.

4. Simulation loop (parallelized via `foreach`):
   - Number of simulations: `B`.
   - For each replication `b = 1, …, B`:
     - Generate regressors `X` and true ratio `truer`.
     - For each observation, find a lognormal scale parameter matching the target quantile ratio via `optimise`.
     - Generate responses `y` from the corresponding lognormal distribution.
     - Fit the four estimators and store:
       - `beta_*_full`: full regression coefficient vector (intercept + slopes).
       - `alpha_*`: estimated unit effects (when defined).
       - `LRE_*`: out-of-sample log ratio error for each estimator.

5. Post-processing:
   - Construction of matrices for regression coefficients:
     - `beta_unreg_mat`, `beta_qtools_mat`, `beta_naive_mat`, `beta_ridge_mat`.
   - Lists of alpha estimates:
     - `alpha_unreg_list`, `alpha_qtools_list`, `alpha_naive_list`, `alpha_ridge_list`.
   - Vectors of LRE values:
     - `LRE_unreg`, `LRE_ridge`, `LRE_qtools`, `LRE_naive`.

6. Summary statistics:
   - Mean RMSE of `beta` for each method vs. the true coefficients `trueBe`.
   - Mean RMSE of `alpha` vs. the true individual effects `alpha_true_id`.
   - Mean out-of-sample LRE for each estimator.

7. Saving the workspace:
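   - The script builds a filename from the simulation settings (the last argument, `c`, fills the `rho` field) and saves the workspace with `save.image()`: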
     ```r
     fname <- sprintf(
       "sim_tau1_%0.2f_tau2_%0.2f_nid_%d_T_%d_rho_%0.2f.RData",
       tau1, tau2, n_id, T_per_id, c
     )
     save.image(file = fname)
     ```
   - This `.RData` file contains all objects created during the session, including:
     - True parameters, simulation settings.
     - Function definitions.
     - Simulation results (beta matrices, alpha lists, LRE vectors).
     - Summary statistics.
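
Since the saved file holds every object from the run, loading it straight into the global environment can overwrite objects with the same names. A minimal sketch of a safer pattern is to load it into a dedicated environment. The setting values, the toy objects, and the use of `save()` on a temporary file as a stand-in for the script's `save.image()` are all assumptions made so the example runs on its own.

```r
# Hypothetical setting values, used only to build an example filename
tau1 <- 0.25; tau2 <- 0.75; n_id <- 50L; T_per_id <- 10L; rho <- 0.5
fname <- file.path(
  tempdir(),
  sprintf("sim_tau1_%0.2f_tau2_%0.2f_nid_%d_T_%d_rho_%0.2f.RData",
          tau1, tau2, n_id, T_per_id, rho)
)

# Stand-in for the script's save.image(): save two toy objects
LRE_unreg <- c(0.12, 0.10, 0.15)   # illustrative numbers, not real results
trueBe    <- c(1, -0.5)            # illustrative numbers, not real settings
save(LRE_unreg, trueBe, file = fname)

# Load into a separate environment so nothing in the global
# environment is overwritten; load() returns the object names.
res    <- new.env()
loaded <- load(fname, envir = res)

mean_lre <- mean(res$LRE_unreg)    # access results through the environment
```

With several result files, one environment per file keeps runs with different settings apart while their objects share the same names.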

---

## 3. Software and dependencies

### R

- R (version ≥ 4.0 recommended).

### R packages

The script uses the following packages:

- **Core modelling and QRR:**
  - `Qtools`
  - `quantreg`
  - `glmnet`
  - `lme4`
  - `MASS`
- **Parallelization:**
  - `foreach`
  - `doParallel`
  - `parallel` (ships with base R, so no installation is needed; used via `parallel::` calls)
- **Sparse matrices and design matrices:**
  - `Matrix`
- **Miscellaneous (loaded by the script, but used minimally or not at all in the core simulation):**
  - `gee`
  - `dplyr`
  - `tidyr`

Please install them before running the script, for example:

```r
install.packages(c(
  "Qtools", "quantreg", "foreach", "doParallel",
  "glmnet", "lme4", "MASS", "Matrix", 
  "gee", "dplyr", "tidyr"
))
